RISCV-BOOM

Questions to answer on BOOM

ACA Materials

Taking this
  1. Topics:
    • How the latest microprocessors work
    • Why they are built that way – and what are the alternatives?
    • How you can make software that uses the hardware in the best possible way
    • How you can make a compiler that does it for you
    • How you can design a computer for your problem
    • What does a big computer look like?
    • What are the fundamental big ideas and challenges in computer architecture?
    • What is the scope for theory?
  2. Have a look at past papers:
    • Some of the exam questions will be based on the article studied in class
    • Exam answers are more like essays: they should show clear and informed thinking about the question at hand
    • 20% coursework
  3. Turing Award lecture by Hennessy and Patterson.
  4. John Backus “Can Programming be Liberated from the von Neumann Style?” (1978) (Have a read)
  1. Opcode is always at the same position in the instruction
  2. Source register is \\
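  3. Branch immediate is a signed offset relative to the address of the branch instruction itself (the current PC)
  4. The “Turing Tax” is a term for the overhead (performance, cost, or energy) of universality, i.e. of running a task on a general-purpose programmable processor rather than on hardware dedicated to that task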
  5. ASIC - Application-specific integrated circuit
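
A minimal sketch of the fixed opcode position and the PC-relative branch immediate noted above, assuming the 32-bit RV32I base encoding. The bit positions are taken from the RISC-V base ISA layout rather than from these notes, and the helper names (`bits`, `decode_fields`, `branch_offset`, `branch_target`) are made up for illustration.

```python
def bits(word, hi, lo):
    """Return bits hi..lo (inclusive) of a 32-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_fields(inst):
    """Extract the fields that sit at fixed positions in the base formats."""
    return {
        "opcode": bits(inst, 6, 0),    # always the low 7 bits
        "rd": bits(inst, 11, 7),
        "funct3": bits(inst, 14, 12),
        "rs1": bits(inst, 19, 15),
        "rs2": bits(inst, 24, 20),
    }


def branch_offset(inst):
    """Reassemble the scattered B-type immediate and sign-extend it."""
    imm = (
        (bits(inst, 31, 31) << 12)   # imm[12]
        | (bits(inst, 7, 7) << 11)   # imm[11]
        | (bits(inst, 30, 25) << 5)  # imm[10:5]
        | (bits(inst, 11, 8) << 1)   # imm[4:1]; imm[0] is always 0
    )
    if imm & (1 << 12):              # sign bit set: value is negative
        imm -= 1 << 13
    return imm


def branch_target(pc, inst):
    """The branch target is the branch's own address plus the offset."""
    return pc + branch_offset(inst)


if __name__ == "__main__":
    beq = 0x00208463                 # beq x1, x2, +8
    print(decode_fields(beq))
    print(branch_offset(beq), hex(branch_target(0x100, beq)))
```

Because the opcode never moves, the decoder can look at the low 7 bits before it knows anything else about the instruction.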
Proper dive 2019-10-15 16:05
  1. Improving average memory access time:
    • AMAT = HitTime + MissRate * MissPenalty, so it improves if we reduce the hit time, the miss rate, or the miss penalty
  2. Types of cache misses:
    • Compulsory - first access to a block that has never been in the cache
    • Capacity - the block was evicted because the working set does not fit in the cache
    • Conflict - the cache is not full, but a line was still evicted because a new line mapped to the same index
    • Coherence - data invalidated by another processor or device
  3. Way speculation (Textbook)
  4. Victim cache:
    • A small side cache that holds lines recently evicted from the main cache
    • Search the victim cache in parallel with the main cache
  5. Skewed-associative caches index each way with a different hash of the address, so lines that conflict in one way need not conflict in another:
    • Can still be worse in some cases, as the worst-case access pattern is harder to predict and so harder to avoid
  6. Hardware Prefetching:
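    • As soon as we have a cache miss, we initiate a fetch for the next block
    • Prefetched blocks go into a stream buffer: like a victim cache it is a side cache, but the dataflow is reversed (blocks arrive from memory before they are used, rather than from the cache after eviction)
    • Always check the stream buffer in parallel with the cache
    • Prefetch far enough ahead to cover the access latency, e.g. fetch line n+5 while line n is being accessed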
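
A rough sketch tying together the AMAT formula, conflict misses and the victim cache from the list above. It assumes a direct-mapped cache addressed by block number, a small LRU victim cache, and that a victim-cache hit costs the same as a main-cache hit (since both are searched in parallel). The class and function names, the 1-cycle hit time and the 100-cycle miss penalty are illustrative assumptions, not figures from the course.

```python
from collections import OrderedDict


def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty


class DirectMappedCache:
    """Direct-mapped cache of block addresses with an optional victim cache."""

    def __init__(self, num_sets, victim_entries=0):
        self.num_sets = num_sets
        self.lines = [None] * num_sets   # block currently held by each set
        self.victim = OrderedDict()      # evicted blocks, oldest first
        self.victim_entries = victim_entries
        self.hits = 0
        self.misses = 0

    def access(self, block):
        index = block % self.num_sets
        if self.lines[index] == block:
            self.hits += 1
            return "hit"
        evicted = self.lines[index]
        if block in self.victim:
            # Found in the side cache: swap it back into the main cache.
            del self.victim[block]
            self.hits += 1
            result = "victim hit"
        else:
            self.misses += 1
            result = "miss"
        self.lines[index] = block
        if evicted is not None and self.victim_entries > 0:
            self.victim[evicted] = True
            if len(self.victim) > self.victim_entries:
                self.victim.popitem(last=False)  # drop the oldest victim
        return result

    def miss_rate(self):
        return self.misses / (self.hits + self.misses)


if __name__ == "__main__":
    # Blocks 0 and 8 map to the same set of an 8-set cache, so they keep
    # evicting each other even though the other 7 sets stay empty.
    trace = [0, 8] * 4
    for entries in (0, 1):
        cache = DirectMappedCache(num_sets=8, victim_entries=entries)
        for block in trace:
            cache.access(block)
        print(entries, cache.miss_rate(), amat(1, cache.miss_rate(), 100))
```

On this ping-pong trace every access is a miss without the victim cache, while a single victim entry leaves only the two compulsory misses.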

Reading:

  • The Microarchitecture of the Pentium 4 Processor (Hinton et al, Intel Tech Jnl Q1 2001)
  • The SimpleScalar Tool Set, Version 2.0 (Burger and Austin, http://www.simplescalar.com/docs/users_guide_v2.pdf)
  • Wattch: a framework for architectural-level power analysis and optimizations (Brooks et al, ISCA 2000) www.tortolaproject.com/papers/brooks00wattch.pdf

Papers:

  • Instruction issue logic for high-performance, interruptable pipelined processors (G. S. Sohi, S. Vajapeyam, International Conference on Computer Architecture, 1987) http://doi.acm.org/10.1145/30350.30354
  • Towards Kilo-instruction processors (Cristal, Santana, Valero, Martinez, ACM Trans. Architecture and Code Optimization) http://doi.acm.org/10.1145/1044823.1044825

Other simulators:

  • Simplescalar: www.simplescalar.com/
  • Gem5: http://www.gem5.org
  • Liberty: http://liberty.cs.princeton.edu/
  • SimFlex: http://parsa.epfl.ch/simflex/
  • SIMICS: http://www.windriver.com/products/simics/

ACA_CW1

ACA CW2